Hello and welcome to a new journey in the vast area of Generative AI!
Generative AI is changing the way we interact with machines, phones and computers. It is reshaping our day-to-day life, where AI is becoming an essential component.
This new way of interaction has many faces: the good, the bad and the ugly.
In this course we will sail the vast sea of Generative AI, covering the theoretical foundations of generative models across the different modality mappings: Txt2Txt, Img2Txt, Txt2Img, Txt2Voice and Voice2Text. We will discuss the SoTA models in each area at the time of this course. This includes the SoTA technology of Transformers and Language Models, with Large Language Models (LLMs) like the Generative Pre-trained Transformer (GPT) paving the way to ChatGPT for text generation; GANs, VAEs and diffusion models like DALL-E and Stable Diffusion for image generation; and VALL-E for voice generation.
In addition, we will cover the practical aspects: we will build simple language models, and build a ChatGPT clone using the OpenAI APIs, taking a tour of OpenAI use cases with GPT-3.5, ChatGPT and DALL-E. We will also cover Huggingface transformers and Stable Diffusion.
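To give a flavor of the practical part, here is a minimal sketch of the kind of OpenAI API call such a ChatGPT clone is built around (assuming the openai Python package with its pre-1.0 interface and an OPENAI_API_KEY in the environment; the prompt is illustrative):

```python
# Minimal ChatGPT-style completion (sketch; assumes openai<1.0, which reads
# the OPENAI_API_KEY environment variable automatically).
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # the model family behind ChatGPT
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain Generative AI in one sentence."},
    ],
)
print(response["choices"][0]["message"]["content"])
```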
Hope you enjoy our journey!
What you will learn
Generative AI definition, areas of application, and mappings like txt2txt, img2txt, txt2img and txt2voice
How ChatGPT works, and the underlying technology behind it, like GPT, Large-Scale Language Models (LLMs) and Transformers
How Latent Diffusion, Stable Diffusion and DALL-E systems work
Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs)
The good, bad and ugly faces of GenAI, and how to adapt to the new tech
Build a ChatGPT clone using the OpenAI API and Streamlit
Build NLP applications using the OpenAI API, like summarization, text classification and fine-tuning GPT models
Build NLP applications using the Huggingface transformers library, like language models, summarization, translation, QA systems and others (see the sketch after this list)
Build a Midjourney clone application using OpenAI DALL-E and Stable Diffusion on Huggingface
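As a taste of the Huggingface side, here is a minimal summarization sketch using the transformers Pipeline interface (the checkpoint name is illustrative; any summarization model works):

```python
# Text summarization via the Huggingface pipeline (sketch).
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
text = ("Generative AI is changing the way we interact with machines. "
        "Models like GPT generate text, while diffusion models like "
        "Stable Diffusion and DALL-E generate images from text prompts.")
print(summarizer(text, max_length=40, min_length=10)[0]["summary_text"])
```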
Transformers in Computer Vision (English)
Transformer networks are the new trend in Deep Learning nowadays. Transformer models have taken the world of NLP by storm since 2017, and since then they have become the mainstream model in almost ALL NLP tasks. Transformers in CV still lag behind, but they have been taking over since 2020.
We will start by introducing attention and transformer networks. Since transformers were first introduced in NLP, they are easier to describe with an NLP example first. From there, we will understand the pros and cons of this architecture. We will also discuss the importance of unsupervised or semi-supervised pre-training for transformer architectures, briefly covering Large-Scale Language Models (LLMs) like BERT and GPT.
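To make the attention idea concrete before we dive in, here is a minimal numpy sketch of scaled dot-product attention, the core operation inside every transformer (the toy input and shapes are illustrative):

```python
# Scaled dot-product attention (sketch).
import numpy as np

def attention(Q, K, V):
    # Q, K, V: (seq_len, d) arrays of queries, keys and values.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # how much each query attends to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V  # each output is a weighted mix of the values

x = np.random.randn(3, 4)  # 3 tokens, 4-dim embeddings
print(attention(x, x, x).shape)  # (3, 4): self-attention preserves the shape
```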
This will pave the way to introducing transformers in CV. Here we will try to extend the attention idea into the 2D spatial domain of the image. We will discuss how convolution can be generalized using self-attention, within the encoder-decoder meta-architecture. We will see how this generic architecture is almost the same for images as for text and NLP, which makes the transformer a generic function approximator. We will discuss channel and spatial attention, and local vs. global attention, among other topics.
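The jump from text to images is smaller than it looks: flatten the 2D feature grid into a sequence of tokens, and the same self-attention applies unchanged. Here is a minimal PyTorch sketch (the shapes are illustrative):

```python
# Treating spatial positions as tokens so self-attention replaces convolution (sketch).
import torch

feature_map = torch.randn(1, 64, 14, 14)  # (batch, channels, H, W) from a CNN backbone
tokens = feature_map.flatten(2).transpose(1, 2)  # -> (batch, H*W, channels) = (1, 196, 64)

attn = torch.nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
out, _ = attn(tokens, tokens, tokens)  # global self-attention over all 196 positions
print(out.shape)  # torch.Size([1, 196, 64])
```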
In the next three modules, we will discuss the specific networks that solve the big problems in CV: classification, object detection and segmentation. We will discuss the Vision Transformer (ViT) from Google, the Shifted Window Transformer (Swin) from Microsoft, the Detection Transformer (DETR) from Facebook research, the Segmentation Transformer (SETR) and many others. Then we will discuss the application of transformers in video processing, through spatio-temporal transformers with application to moving object detection, along with a Multi-Task Learning setup.
Finally, we will show how those pre-trained architectures can be easily applied in practice through the Pipeline interface of the famous Huggingface library.
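As a preview of that Pipeline interface, here is a minimal image-classification sketch with a pre-trained ViT (the checkpoint name and image URL are illustrative):

```python
# Image classification with a pre-trained Vision Transformer (sketch).
from transformers import pipeline

classifier = pipeline("image-classification", model="google/vit-base-patch16-224")
preds = classifier("http://images.cocodataset.org/val2017/000000039769.jpg")
for p in preds:
    print(p["label"], round(p["score"], 3))
```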
Pre-requisites
Practical Machine Learning course
Practical Computer Vision course (ConvNets)
Introduction to NLP course
Topics Covered
Overview of Transformer Networks
Transformers in CV
Transformers for image classification
Transformers for object detection
Transformers for semantic segmentation
Huggingface transformers in CV
Conclusion
What you will learn
What are transformer networks?
State of the Art architectures for CV Apps like Image Classification, Semantic Segmentation, Object Detection and Video Processing
Practical application of SoTA architectures like ViT, DETR and Swin using Huggingface vision transformers
Attention mechanisms as a general Deep Learning idea
Inductive Bias and the landscape of DL models in terms of modeling assumptions
Transformer applications in NLP and Machine Translation
Transformers in Computer Vision
Different types of attention in Computer Vision
Reinforcement Learning (English)
Hello and welcome to our course: Reinforcement Learning.
Reinforcement Learning is a very exciting and important field of Machine Learning and AI. Some call it the crown jewel of AI.
In this course, we will cover all the aspects of Reinforcement Learning, or RL. We will start by defining the RL problem, comparing it to the Supervised Learning problem, and discovering the areas of application where RL can excel. This includes the problem formulation, starting from the very basics up to the advanced usage of Deep Learning, leading to the era of Deep Reinforcement Learning.
In our journey, we will cover, as usual, both the theoretical and practical aspects, where we will learn how to implement RL algorithms and apply them to famous problems using libraries like OpenAI Gym, Keras-RL, TensorFlow Agents (TF-Agents) and Stable Baselines.
The course is divided into 6 main sections:
1- We start with an introduction to the RL problem definition, mainly comparing it to the Supervised Learning problem, and discovering the application domains and the main constituents of an RL problem. We describe here the famous OpenAI Gym environments, which will be our playground for the practical implementation of the algorithms we learn about.
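As a first look at that playground, here is a minimal random-agent loop in Gym (a sketch assuming the gym>=0.26 reset/step API, which differs in older versions):

```python
# A random agent interacting with a Gym environment (sketch; gym>=0.26 API).
import gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
done, total_reward = False, 0.0
while not done:
    action = env.action_space.sample()  # no learning yet: act randomly
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
print("episode return:", total_reward)
```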
2- In the second part we discuss the main formulation of an RL problem as a Markov Decision Process (MDP), with simple solutions to the most basic problems using Dynamic Programming.
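As a taste of the Dynamic Programming part, here is a minimal Value Iteration sketch on a made-up 2-state, 2-action MDP (all transition and reward numbers are illustrative):

```python
# Value Iteration: repeated Bellman optimality backups (sketch).
import numpy as np

gamma = 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],   # P[s, a, s']: transition probabilities
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],                  # R[s, a]: expected immediate reward
              [0.0, 2.0]])

V = np.zeros(2)
for _ in range(100):
    Q = R + gamma * P @ V   # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] * V[s']
    V = Q.max(axis=1)       # back up the best action's value
print("V* ~", V, " greedy policy:", Q.argmax(axis=1))
```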
3- Armed with an understanding of MDPs, we move on to explore the solution space of the MDP problem and the different solutions beyond DP, which include model-based and model-free solutions. We will focus in this part on model-free solutions, and defer model-based solutions to the last part. Here we describe the Monte-Carlo and Temporal-Difference sampling-based methods, including the famous and important Q-learning algorithm, and SARSA. We will describe the practical usage and implementation of Q-learning and SARSA on tabular maze control problems from the OpenAI Gym environments.
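Here is a minimal tabular Q-learning sketch on a Gym maze-style environment (FrozenLake and all hyperparameters are illustrative; assumes the gym>=0.26 API):

```python
# Tabular Q-learning with epsilon-greedy exploration (sketch).
import numpy as np
import gym

env = gym.make("FrozenLake-v1", is_slippery=False)
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, eps = 0.1, 0.99, 0.3

for episode in range(5000):
    s, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy: mostly exploit, sometimes explore
        a = env.action_space.sample() if np.random.rand() < eps else int(Q[s].argmax())
        s2, r, terminated, truncated, _ = env.step(a)
        done = terminated or truncated
        # off-policy TD update: bootstrap from the best next action
        target = r + (0.0 if terminated else gamma * Q[s2].max())
        Q[s, a] += alpha * (target - Q[s, a])
        s = s2
print("greedy policy:", Q.argmax(axis=1))
```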
4- To move beyond simple tabular problems, we will need to learn about function approximation in RL, which leads to today's mainstream RL methods using Deep Learning, or Deep Reinforcement Learning (DRL). We describe here DeepMind's breakthrough algorithm that solved the Atari games, Deep Q-Networks (DQN), which paved the way to successes like AlphaGo. We also discuss how we can solve Atari game problems using DQN in practice with Keras-RL and TF-Agents.
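A hedged sketch of how a DQN agent is wired up in Keras-RL (shown on CartPole for brevity; assumes the keras-rl2 package and a compatible gym version, since exact APIs vary across releases):

```python
# DQN with Keras-RL: Q-network + replay memory + exploration policy (sketch).
import gym
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import EpsGreedyQPolicy

env = gym.make("CartPole-v1")
nb_actions = env.action_space.n

# Q-network: maps an observation to one Q-value per action.
model = Sequential([
    Flatten(input_shape=(1,) + env.observation_space.shape),
    Dense(32, activation="relu"),
    Dense(32, activation="relu"),
    Dense(nb_actions, activation="linear"),
])

agent = DQNAgent(model=model, nb_actions=nb_actions,
                 memory=SequentialMemory(limit=50000, window_length=1),
                 policy=EpsGreedyQPolicy(eps=0.1),
                 nb_steps_warmup=100, target_model_update=1000)
agent.compile(Adam(learning_rate=1e-3), metrics=["mae"])
agent.fit(env, nb_steps=10000, verbose=1)
```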
5- In the fifth part, we move to advanced DRL algorithms, mainly a family called policy-based methods. We discuss here Policy Gradients, DDPG, Actor-Critic, A2C, A3C, TRPO and PPO. We also discuss the important Stable Baselines library, used to implement all those algorithms on different OpenAI Gym environments, like Atari and others.
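With Stable Baselines3 (the maintained successor of the original Stable Baselines), training PPO takes only a few lines; a sketch assuming the stable-baselines3 package:

```python
# Training PPO on CartPole with Stable Baselines3 (sketch).
from stable_baselines3 import PPO

model = PPO("MlpPolicy", "CartPole-v1", verbose=1)  # clipped policy-gradient objective
model.learn(total_timesteps=20_000)                 # collect rollouts and update the policy
model.save("ppo_cartpole")                          # reload later with PPO.load(...)
```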
6- Finally, we explore the model-based family of RL methods and, importantly, differentiate model-based RL from planning, exploring the whole spectrum of RL methods.
We hope you enjoy this course and find it useful.
Pre-requisites
Machine Learning basics
Deep Learning basics
Probability
Programming and Problem solving basics
Python programming
Topics Covered
Introduction to Reinforcement Learning
Markov Decision Process (MDP)
MDP Solution Space
Deep Reinforcement Learning (DRL)
Advanced DRL
Model based RL
What you will learn
Define what Reinforcement Learning is
Apply everything learned using state-of-the-art libraries like OpenAI Gym, Stable Baselines, Keras-RL and TensorFlow Agents
Describe the application domains and success stories of RL
Explain the differences between Reinforcement Learning and Supervised Learning
Define the main components of an RL problem setup
Define the main ingredients of an RL agent and their taxonomy
Define Markov Reward Processes (MRP) and Markov Decision Processes (MDP)
Define the solution space of RL using the MDP framework
Solve RL problems using planning with Dynamic Programming algorithms like Policy Evaluation, Policy Iteration and Value Iteration
Solve RL problems using model free algorithms like Monte-Carlo, TD learning, Q-learning and SARSA
Differentiate On-policy and Off-policy algorithms
Master Deep Reinforcement Learning algorithms like Deep Q-Networks (DQN), and apply them to Large Scale RL
Master Policy Gradients algorithms and Actor-Critic (AC, A2C, A3C)
Master advanced DRL algorithms like DDPG, TRPO and PPO
Define model-based RL, differentiate it from planning, and describe the main algorithms and applications of each